Zap Q-Learning With Nonlinear Function Approximation

Neural Information Processing Systems

Zap Q-learning is a recent class of reinforcement learning algorithms, motivated primarily as a means to accelerate convergence. Stability theory has been absent outside of two restrictive classes: the tabular setting, and optimal stopping. This paper introduces a new framework for analysis of a more general class of recursive algorithms known as stochastic approximation. Based on this general theory, it is shown that Zap Q-learning is consistent under a non-degeneracy assumption, even when the function approximation architecture is nonlinear. Zap Q-learning with neural network function approximation emerges as a special case, and is tested on examples from OpenAI Gym. Based on multiple experiments with a range of neural network sizes, it is found that the new algorithms converge quickly and are robust to choice of function approximation architecture.
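As a rough illustration of the stochastic approximation framework the abstract refers to, the Zap recursion pairs a parameter update with a faster-running estimate of the mean-field Jacobian, whose negative inverse serves as the matrix gain. Below is a minimal sketch on a synthetic linear root-finding problem; the problem instance, noise model, and step-size exponents are assumptions of this illustration, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic root-finding problem (an assumption of this sketch): find theta
# solving f_bar(theta) = b - A @ theta = 0, observed only through noisy samples.
A = np.array([[2.0, 0.5],
              [0.0, 1.0]])
b = np.array([1.0, -1.0])
theta_star = np.linalg.solve(A, b)

theta = np.zeros(2)
A_hat = -np.eye(2)   # running Jacobian estimate; overwritten at n = 1
for n in range(1, 100001):
    f_n = b - A @ theta + rng.standard_normal(2)        # noisy sample of f_bar
    A_n = -A + 0.1 * rng.standard_normal((2, 2))        # noisy Jacobian sample
    alpha, beta = 1.0 / n, 1.0 / n ** 0.85              # two time scales
    A_hat += beta * (A_n - A_hat)                       # fast: matrix gain estimate
    theta += alpha * np.linalg.solve(-A_hat, f_n)       # slow: Newton-Raphson-like step
```

Because the Jacobian estimate runs on the faster step size `beta`, the parameter update sees an effectively converged gain, which is what yields the Newton-Raphson-like transient behaviour described in the abstracts below.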


Zap Q-Learning

Devraj, Adithya M, Meyn, Sean

Neural Information Processing Systems

The Zap Q-learning algorithm introduced in this paper is an improvement of Watkins' original algorithm and recent competitors in several respects. It is a matrix-gain algorithm designed so that its asymptotic variance is optimal. Moreover, an ODE analysis suggests that the transient behavior is a close match to a deterministic Newton-Raphson implementation. This is made possible by a two time-scale update equation for the matrix gain sequence. The analysis suggests that the approach will lead to stable and efficient computation even for non-ideal parameterized settings. Numerical experiments confirm the quick convergence, even in such non-ideal cases.
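The matrix-gain, two time-scale update described in the abstract can be sketched in the tabular setting: the Jacobian estimate is updated with a faster step size than the parameter vector, and its negative inverse plays the role of a Newton-Raphson gain. A minimal sketch on a toy MDP follows; the MDP, the step-size exponents, the conditioning offset in `beta`, and the pseudo-inverse safeguard are assumptions of this illustration, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy deterministic MDP (an assumption of this sketch): 2 states, 2 actions.
# Action 0 stays in the current state, action 1 switches state.
nS, nA, gamma = 2, 2, 0.8
R = np.array([[0.0, 1.0],    # rewards in state 0 for actions (stay, switch)
              [2.0, 0.0]])   # rewards in state 1
nxt = np.array([[0, 1],
                [1, 0]])     # deterministic successor states

# Reference solution via Q-value iteration.
Q_star = np.zeros((nS, nA))
for _ in range(200):
    Q_star = R + gamma * Q_star[nxt].max(axis=2)

# Tabular Zap Q-learning: theta is the flattened Q-table, and psi(s, a) is a
# standard basis vector, so the matrix gain acts on the full parameter vector.
d = nS * nA
theta = np.zeros(d)
A_hat = -np.eye(d)                     # estimate of the mean-field Jacobian
s = 0
for n in range(1, 100001):
    a = int(rng.integers(nA))          # uniform exploration
    s2 = nxt[s, a]
    Q = theta.reshape(nS, nA)
    a2 = int(Q[s2].argmax())           # greedy successor action
    psi = np.zeros(d);  psi[s * nA + a] = 1.0
    psi2 = np.zeros(d); psi2[s2 * nA + a2] = 1.0
    td = R[s, a] + gamma * Q[s2, a2] - Q[s, a]   # temporal-difference error
    # Two time scales: the matrix gain runs on the faster step size beta.
    # The offset in beta keeps the early estimate well conditioned (an
    # assumption of this sketch, not part of the published algorithm).
    alpha = 1.0 / n
    beta = (n + 10.0) ** -0.85
    A_n = np.outer(psi, gamma * psi2 - psi)
    A_hat += beta * (A_n - A_hat)
    # Stochastic Newton-Raphson step with gain -A_hat^{-1}; the pseudo-inverse
    # guards against transient singularity of the estimate.
    theta += alpha * (-np.linalg.pinv(A_hat)) @ (psi * td)
    s = s2

Q_zap = theta.reshape(nS, nA)
```

With the ordinary scalar-gain update in place of the matrix gain, the same 1/n step size is exactly the regime where Watkins' algorithm can suffer very high asymptotic variance, which is the motivation for the matrix gain above.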



Review for NeurIPS paper: Zap Q-Learning With Nonlinear Function Approximation

Neural Information Processing Systems

Summary and Contributions: This paper introduces a version of Zap Q-learning that can be applied to arbitrary approximation architectures for Q-functions. Convergence analysis is undertaken, and a version of the algorithm with MLP function approximators is applied to several classical control tasks.

POST-REBUTTAL: I thank the authors for their response. I appreciate the comments around the reorganisation of material, and the clarification of some of the technical points I raised. There are two main concerns that I have with the paper that prevent me from strongly recommending acceptance, described below.


Review for NeurIPS paper: Zap Q-Learning With Nonlinear Function Approximation

Neural Information Processing Systems

The reviewers are generally supportive of the paper. They have provided some very useful feedback, and I strongly encourage the authors to incorporate it. In particular, it would be ideal to complete the paper reorganization as discussed, explain the limitations of the assumption on boundedness of the iterates, provide a toy example where the boundedness assumption is not on its own enough to prevent divergence of Q-learning (i.e., even under that assumption, Q-learning diverges but Zap-Q does not), and finally to sweep over the parameters in the empirical comparison (even if that means the outcome is less positive for Zap-Q).


Reviews: Zap Q-Learning

Neural Information Processing Systems

The paper proposes a variant of Q-learning, called Zap Q-learning, that is more stable than its precursor. Specifically, the authors show that, in the tabular case, their method minimises the asymptotic covariance of the parameter vector by applying approximate second-order updates based on the stochastic Newton-Raphson method. The behaviour of the algorithm is analysed for the particular case of a tabular representation, and experiments are presented showing the empirical performance of the method in its most general form. This is an interesting paper that addresses a core issue in RL. I have some comments regarding both its content and its presentation.


